[Proposal] Adds precision, recall, and F1 score to evaluate detections #4644
base: develop
Conversation
Actionable comments posted: 0
Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Files selected for processing (1)
- fiftyone/utils/eval/detection.py (8 hunks)
Additional comments not posted (7)
fiftyone/utils/eval/detection.py (7)
- `82-100`: Docstring update for new metrics looks good. The docstring has been updated to include precision, recall, and F1-score, which aligns with the new functionality.
- `191-193`: Field names for new metrics are correctly defined. The new fields for precision, recall, and F1-score are well-named and consistent with existing field naming conventions.
- `223-236`: Correct calculation of precision, recall, and F1-score for frames. The calculations are correctly implemented with checks to avoid division by zero (see the sketch after this list).
- `243-256`: Correct calculation of precision, recall, and F1-score for samples. The calculations are likewise implemented with checks to avoid division by zero.
- `353-370`: Registration of new fields for precision, recall, and F1-score is correct. The new fields are correctly registered for both samples and frames, ensuring they are integrated into the dataset schema.
- Line range hint `480-501`: Inclusion of new metrics in `get_fields` is correct. The new metrics are correctly included in the field list, ensuring they are handled during data processing.
- `532-534`: Cleanup process for new metrics is correctly implemented. The new metrics are correctly included in the cleanup process, ensuring they are properly removed.
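For concreteness, here is a minimal sketch of a guarded per-sample computation like the one the review describes; the helper name and structure are illustrative and not taken from the PR's diff:

```python
def compute_prf1(tp, fp, fn):
    """Precision, recall, and F1 with divide-by-zero guards (illustrative)"""
    p = tp / (tp + fp) if tp + fp > 0 else 0.0
    r = tp / (tp + fn) if tp + fn > 0 else 0.0
    f1 = 2 * p * r / (p + r) if p + r > 0 else 0.0
    return p, r, f1


# Example: 8 TP, 2 FP, 4 FN -> precision 0.8, recall ~0.667, F1 ~0.727
print(compute_prf1(8, 2, 4))
```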
Documenting a discussion we had offline about this: P/R/F1 are not net-new from a data model standpoint; they can be computed from existing fields as shown below. That said, there's definitely some value to populating these fields if users prefer the ability to filter in the sidebar by P/R/F1 rather than raw counts. On the other hand, this adds to the number of fields + dataset size, especially if you're running multiple evaluations on one dataset. Ultimately, since we're building a model evaluation panel that will directly expose the ability to plot + filter by P/R/F1, I'd suggest that we don't need to duplicate them on the dataset.

```python
import numpy as np

import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart")

results = dataset.evaluate_detections("predictions", eval_key="eval")
print(results.report())

# Histograms
tp = np.array(dataset.values("eval_tp"))
fp = np.array(dataset.values("eval_fp"))
fn = np.array(dataset.values("eval_fn"))

p = tp / (tp + fp)
r = tp / (tp + fn)
f1 = 2 * (p * r) / (p + r)

# Callbacks
tp = F("eval_tp")
fp = F("eval_fp")
fn = F("eval_fn")

p = tp / (tp + fp)
r = tp / (tp + fn)
f1 = 2 * (p * r) / (p + r)

view1 = dataset.match((p > 0.1) & (p < 0.5))
view2 = dataset.match((r > 0.1) & (r < 0.5))
view3 = dataset.match((f1 > 0.1) & (f1 < 0.5))
```
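One caveat with the array version above: samples with zero predictions or zero matches produce 0/0, which NumPy evaluates to NaN (with a runtime warning). A guarded variant, as a sketch that reuses the `dataset` from the snippet above:

```python
import numpy as np

# Recompute from the "Histograms" arrays; guards map 0/0 to 0.0 instead of NaN
tp = np.array(dataset.values("eval_tp"))
fp = np.array(dataset.values("eval_fp"))
fn = np.array(dataset.values("eval_fn"))

with np.errstate(divide="ignore", invalid="ignore"):
    p = np.where(tp + fp > 0, tp / (tp + fp), 0.0)
    r = np.where(tp + fn > 0, tp / (tp + fn), 0.0)
    f1 = np.where(p + r > 0, 2 * (p * r) / (p + r), 0.0)
```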
What changes are proposed in this pull request?
Adds precision, recall, and F1-score to samples and frames after running `evaluate_detections()` (see the usage sketch below).
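For illustration, a sketch of how the new fields could be accessed, assuming they follow the existing `<eval_key>_tp`/`_fp`/`_fn` naming convention (the `eval_precision`/`eval_recall`/`eval_f1` names below are an assumption, not confirmed by this description):

```python
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")
dataset.evaluate_detections("predictions", eval_key="eval")

# Assumed field names based on the existing "<eval_key>_*" convention
sample = dataset.first()
print(sample["eval_precision"], sample["eval_recall"], sample["eval_f1"])
```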
How is this patch tested? If it is not, please explain why.
Release Notes
Is this a user-facing change that should be mentioned in the release notes?
Yes. Give a description of this change to be included in the release notes for FiftyOne users.
(Details in 1-2 sentences. You can just refer to another PR with a description if this PR is part of a larger change.)
What areas of FiftyOne does this PR affect?
`fiftyone` - Python library changes

Summary by CodeRabbit
- New Features: precision, recall, and F1-score are now populated on samples and frames by `evaluate_detections()`.
- Bug Fixes
- Documentation: docstrings updated to describe the new metrics.